Continuous Top-k Monitoring on Document Streams
Authors
Abstract
Similar resources
Top-k Context-Aware Queries on Streams
Preference queries have been widely studied for relational systems, but few proposals exist for stream data systems. Most of the existing proposals concern the skyline, top-k or top-k dominating queries, coupled with the sliding-window operator. However, user preference queries on data streams may be more sophisticated than simple skyline or top-k and may involve more expressive operations on ...
Optimal Top-k Document Retrieval
Let D be a collection of D documents, which are strings over an alphabet of size σ, of total length n. We describe a data structure that uses linear space and reports k most relevant documents that contain a query pattern P, which is a string of length p, in time O(p/log_σ n + k), which is optimal in the RAM model in the general case where lg D = Θ(log n), and involves a novel RAM-optimal suf...
Continuous top-k queries over real-time web streams. (Evaluation de requêtes top-k continues à large-échelle)
The Web has become a large-scale real-time information system, forcing us to revise both how to effectively assess relevance of information for a user and how to efficiently implement information retrieval and dissemination functionality. To increase information relevance, Real-time Web applications such as Twitter and Facebook extend content and social-graph relevance scores with “real-time” u...
Top-k/w publish/subscribe: A publish/subscribe model for continuous top-k processing over data streams
Continuous processing of top-k queries over data streams is a promising technique for alleviating the information overload problem as it distinguishes relevant from irrelevant data stream objects with respect to a given scoring function over time. Thus it enables filtering of irrelevant data objects and delivery of top-k objects relevant to user interests in real-time. We propose a solution for...
Finding top-k elements in data streams
Identifying the most frequent elements in a data stream is a well known and difficult problem. Identifying the most frequent elements for each individual, especially in very large populations, is even harder. The use of fast and small memory footprint algorithms is paramount when the number of individuals is very large. In many situations such analysis needs to be performed and kept up to date ...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2017
ISSN: 1041-4347,1558-2191,2326-3865
DOI: 10.1109/tkde.2017.2657622